Tolerating Faults in a Mesh with a Row of Spare Nodes

Authors

  • Jehoshua Bruck
  • Robert Cypher
  • C. T. Howard Ho
Abstract

Bruck, J., R. Cypher and C.-T. Ho, Tolerating faults in a mesh with a row of spare nodes, Theoretical Computer Science 128 (1994) 241-252.

We present an efficient method for tolerating faults in a two-dimensional mesh architecture. Our approach is based on adding spare components (nodes) and extra links (edges) such that the resulting architecture can be reconfigured as a mesh in the presence of faults. We optimize the cost of the fault-tolerant mesh architecture by adding about one row of redundant nodes in addition to a set of k spare nodes (while tolerating up to k node faults) and minimizing the number of links per node. Our results are surprisingly efficient and seem to be practical for small values of k. The degree of the fault-tolerant architecture is k + 5 for odd k, and k + 6 for even k. Our results can be generalized to d-dimensional meshes.
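The degree figures quoted in the abstract can be tabulated with a short sketch. It encodes only the stated formula (k + 5 for odd k, k + 6 for even k) and does not reproduce the paper's construction. The function name `fault_tolerant_degree` and the restriction to k >= 1 are assumptions made for illustration, and the comparison uses the fact that an interior node of an ordinary two-dimensional mesh has degree 4.

```python
def fault_tolerant_degree(k: int) -> int:
    """Node degree quoted in the abstract: k + 5 for odd k, k + 6 for even k.

    Hypothetical helper: the name and the k >= 1 check are assumptions,
    not part of the paper.
    """
    if k < 1:
        raise ValueError("k must be a positive number of tolerated node faults")
    return k + 5 if k % 2 == 1 else k + 6


# Interior nodes of an ordinary 2D mesh have degree 4 (one link per neighbour).
PLAIN_MESH_DEGREE = 4

for k in range(1, 5):
    degree = fault_tolerant_degree(k)
    extra = degree - PLAIN_MESH_DEGREE
    print(f"k={k}: degree {degree} ({extra} more links per node than a plain mesh)")
```

Under this formula, an even k and the following odd k give the same degree (for example, k = 2 and k = 3 both give 8), so tolerating one more fault from an even k does not raise the stated degree.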


Similar resources

Fault-Tolerant Meshes and Hypercubes with Minimal Numbers of Spares

Many parallel computers consist of processors connected in the form of a d-dimensional mesh or hypercube. Two- and three-dimensional meshes have been shown to be efficient in manipulating images and dense matrices, whereas hypercubes have been shown to be well suited to divide-and-conquer algorithms requiring global communication. However, even a single faulty processor or communication link can s...


Fault-Tolerance in Augmented Hypercube Multicomputers

This paper describes different schemes for tolerating faults in augmented hypercube multiprocessors. The architectures considered have a spare assigned to each subset of nodes (cluster). The approaches make use of hardware redundancy in the form of spare nodes and/or links and usually require modifications in the communication as well as computation algorithms.


Row/Column-First: A Path-based Multicast Algorithm for 2D Mesh-based Network on Chips

In this paper, we propose a new path-based multicast algorithm called the Row/Column-First algorithm. The proposed algorithm constructs a set of multicast paths to deliver a multicast message to all multicast destination nodes. The multicast paths are all of the row-first or column-first subcategories to maximize the multicast performance. The selection of row-first or column-first appro...


Disjoint Covers in Replicated Heterogeneous Arrays

Reconfigurable chips are fabricated with redundant elements that can be used to replace the faulty elements. The fault cover problem consists of finding an assignment of redundant elements to the faulty elements such that all of the faults are repaired. In reconfigurable chips that consist of arrays of elements, redundant elements are configured as spare rows and spare columns. This paper consi...


Design Methodologies for Tolerating Cell and Interconnect Faults in FPGAs

The very high levels of integration and submicron device sizes used in current and emerging VLSI technologies for FPGAs lead to higher occurrences of defects and operational faults. Thus, there is a critical need for fault tolerance and reconfiguration techniques for FPGAs to increase chip yields (with factory reconfiguration) and/or system reliability (with field reconfiguration). We first pro...


Journal:
  • Theor. Comput. Sci.

Volume: 128

Pages: 241-252

Publication date: 1992